
Hardware-Software Codesign of Accurate, Multiplier-free Deep Neural Networks



Abstract

While Deep Neural Networks (DNNs) push the state-of-the-art in many machine learning applications, they often require millions of expensive floating-point operations for each input classification. This computation overhead limits the applicability of DNNs to low-power, embedded platforms and incurs high cost in data centers. This motivates recent interest in designing low-power, low-latency DNNs based on fixed-point, ternary, or even binary data precision. While recent works in this area offer promising results, they often lead to large accuracy drops when compared to the floating-point networks. We propose a novel approach to map floating-point based DNNs to 8-bit dynamic fixed-point networks with integer power-of-two weights, with no change in network architecture. Our dynamic fixed-point DNNs allow different radix points between layers. During inference, power-of-two weights allow multiplications to be replaced with arithmetic shifts, while the 8-bit fixed-point representation simplifies both the buffer and adder design. In addition, we propose a hardware accelerator design to achieve low-power, low-latency inference with insignificant degradation in accuracy. Using our custom accelerator design with the CIFAR-10 and ImageNet datasets, we show that our method achieves significant power and energy savings while increasing the classification accuracy.
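The core trick the abstract describes is constraining each weight to a signed integer power of two, so that a multiply becomes an arithmetic shift. A minimal Python sketch of that idea follows; the function names, the exponent range, and the Q1.7 fixed-point format are illustrative assumptions, not the authors' implementation:

```python
import math

def quantize_pow2(w, min_exp=-7, max_exp=0):
    """Map a float weight to the nearest signed power of two.

    Returns (sign, exp) with sign in {-1, 0, +1}, so the quantized
    weight is sign * 2**exp. The exponent range is an assumed
    illustration, not taken from the paper.
    """
    if w == 0.0:
        return 0, 0
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))
    return sign, max(min_exp, min(max_exp, exp))

def shift_mul(act, sign, exp):
    """Multiply an integer fixed-point activation by sign * 2**exp
    using only an arithmetic shift and a sign flip (no multiplier)."""
    shifted = act << exp if exp >= 0 else act >> -exp
    return sign * shifted

# Example: activation 0.5 stored as 8-bit fixed point with 7
# fractional bits -> integer 64; weight 0.3 quantizes to +2**-2,
# so the product is 64 >> 2 = 16, i.e. 0.125 in Q1.7.
sign, exp = quantize_pow2(0.3)
product = shift_mul(64, sign, exp)
```

Accumulating such shifted products with plain integer adders is what lets the proposed accelerator drop multipliers entirely; the per-layer radix point then only changes how the final accumulator value is interpreted.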
